Under consideration for publication in Theory and Practice of Logic Programming 






Improving Prolog Programs: Refactoring for Prolog



ALEXANDER SEREBRENIK 

Laboratory of Quality of Software (LaQuSo), T.U. Eindhoven 
HG 5.9 L Den Dolech 2, P.O. Box 513, 5600 MB Eindhoven, The Netherlands 
(e-mail: A.Serebrenik@tue.nl) 

TOM SCHRIJVERS* 

Department of Computer Science, K.U. Leuven 
Celestijnenlaan 200A, B-3001, Heverlee, Belgium 
(e-mail: Tom.Schrijvers@cs.kuleuven.ac.be) 

BART DEMOEN 

Department of Computer Science, K.U. Leuven 
Celestijnenlaan 200A, B-3001, Heverlee, Belgium 
(e-mail: Bart.Demoen@cs.kuleuven.ac.be) 



Abstract 

Refactoring is an established technique from the object-oriented (OO) programming community to 
restructure code: it aims at improving software readability, maintainability and extensibility. Al- 
though refactoring is not tied to the OO-paradigm in particular, its ideas have not been applied to 
Logic Programming until now. 

This paper applies the ideas of refactoring to Prolog programs. A catalogue is presented listing 
refactorings classified according to scope. Some of the refactorings have been adapted from the 
OO-paradigm, while others have been specifically designed for Prolog. The discrepancy between 
intended and operational semantics in Prolog is also addressed by some of the refactorings. 

In addition, ViPReSS, a semi-automatic refactoring browser, is discussed and the experience with 
applying ViPReSS to a large Prolog legacy system is reported. The main conclusion is that refactoring 
is both a viable technique in Prolog and a rather desirable one. 



1 Introduction 

Maintaining and adapting software takes up a substantial part of the entire programming 
effort, both in time and money. Erlikh (2000) and Moad (1990) both report that maintenance 
costs exceed 90% of the budget. About 75% of these costs 
are spent on providing enhancements (in the form of adaptive or perfective maintenance) 
(Nosek and Palvia 1990; van Vliet 2000). 

Before providing enhancements, it is recommended to improve the design of the software 
in a preliminary step. This methodology, called refactoring, emerged from a number 
of pioneer results in the OO-community (Fowler et al. 1999; Opdyke 1992; Roberts et al. 1997) 
and recently came to prominence for functional (Li et al. 2003) and procedural 
(Garrido and Johnson 2003) languages. 

Refactoring is a disciplined technique for restructuring an existing body of code, alter- 
ing its internal structure without changing its external behavior. Its heart is a series of small 
source-to-source program transformations, called refactorings, that change program struc- 
ture and organization, but not program functionality. The major aim of refactoring is to 
improve readability, maintainability and extensibility of the existing software. 

While performance improvement is not considered as a crucial issue for refactoring, 
it can be noted that well-structured software is more amenable to performance tuning. 
We also observe that certain techniques that were developed in the context of program 
optimization, such as dead-code elimination and redundant argument filtering, can improve 
program organization and, hence, can be used as refactoring techniques. 

In this paper we study refactoring techniques for Prolog. Our goals are threefold. Firstly, 
we want to show that refactoring is a viable technique for Prolog and many of the existing 
techniques developed for refactoring in general are applicable. Secondly, Prolog-specific 
refactorings are possible and the application of some general techniques may be highly 
specialized towards Prolog. Finally, it should be clear that refactoring is not only viable for 
Prolog but also very useful for the maintenance of Prolog programs. 

In order to achieve our goals we present a catalogue of refactoring techniques for Pro- 
log. The listed refactorings are a mix of general and Prolog-specific ones. Most of the 
refactorings proposed have been implemented in a prototype refactoring browser ViPReSS. 
ViPReSS has been successfully applied for refactoring a 50,000 lines-long legacy system. 

As completeness of the catalogue is clearly not possible, we aimed to show a wide range 
of possibilities for future work on combining the formal techniques of program analysis 
and transformation with software engineering. The formal elaboration of a particular topic 
may be a substantial study on its own, as shown by the work on detecting duplicate code by 
Vanhoof (2004), which was inspired by a preliminary version of our work. 

Outline of the Paper. First, Section 2 provides a brief overview of the refactoring process. 
Next, the use of several refactoring techniques is illustrated on a small example in Section 3. 
Then a catalogue of Prolog refactorings is given in Section 4. In Section 5 we introduce 
ViPReSS, and discuss its application in a case study. Finally, in Section 6 we conclude. 

2 The Refactoring Process 

The refactoring process consists of applying a number of refactorings, with both localized 
and global impact, to a software system. The individual significance of a refactoring may 
be apparent, but often a refactoring seems trivial on its own and only in conjunction with 
other refactorings or intended changes does the usefulness become clear. That is the reason 
why it is not feasible to fully automate refactorings. They must be carefully considered in 
view of the programmer's intentions. 

For this reason the process of applying a single refactoring is to be split into a number of 
distinct activities (Mens and Tourwé 2004). These activities involve decisions to be made 
by the programmer. 

The first decision is where the software should be refactored. Making this decision 
automatically can be a difficult task on its own. Several ways to resolve this may be considered. 
For instance, one can aim at identifying so called bad smells, i.e., "structures of the code 
that suggest (sometimes scream for) the possibility of refactoring" (Fowler et al. 1999). To 
this end program analysis can be used. For example, it is common practice while ordering 
predicate arguments to start with the input arguments and end with the output arguments. 
Mode information can be used to detect when this rule is violated. 
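
As a minimal sketch of such an analysis, assume that mode information is already available 
as facts. The names pred_modes/2, violates_input_first/1 and bad_smell/1 are hypothetical 
and serve only as an illustration; they are not taken from ViPReSS. 

```prolog
:- use_module(library(lists)).

% Hypothetical mode facts: pred_modes(Name/Arity, ListOfModes).
pred_modes(reader_next/3, [out, in, out]).
pred_modes(reader_done/1, [in]).

% violates_input_first(+Modes): some output argument precedes an input one.
violates_input_first(Modes) :-
    append(_, [out|Rest], Modes),
    memberchk(in, Rest).

% bad_smell(-Pred): Pred breaks the "input first, output last" rule.
bad_smell(Pred) :-
    pred_modes(Pred, Modes),
    violates_input_first(Modes).
```

Under these assumptions the query ?- bad_smell(P). reports reader_next/3, the predicate 
whose arguments are reordered in Section 3. 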

Next, one should determine which refactorings should be applied. Sometimes, the 
correspondence between bad smells and refactorings is clear. For instance, if the predicate 
arguments are not ordered according to the "input first output last" rule, one can suggest 
to the user to reorder the arguments. This refactoring is further discussed in Section 4.3. 
In more complex situations the relation becomes less obvious: a number of different 
refactorings are applicable and the user has to choose between them. For example, let module 
A contain a predicate that is mutually recursive with predicate p from module B, and 
module C contain a predicate that is mutually recursive with predicate q from module B. This 
situation can be identified as problematic since no clear hierarchy can be defined between 
these modules. One possible solution would be to merge the three modules (Section 4.2). 
Alternatively, one may try to first split B into B1, containing p, and B2, containing q, such 
that there are no circular dependencies between B1 and B2 (Section 4.2). If this split is 
possible, A could be merged with B1, and C with B2 (Section 4.2). Automatic refactoring 
tools, so called refactoring browsers, can be expected to make suggestions on where 
refactoring transformations should be applied. These suggestions can then be either confirmed 
or rejected by the programmer. 

By definition, refactorings should preserve the software's functionality. Hence, the next 
step consists of ensuring that the behavior is indeed preserved. This step, of course, 
depends on the definition of behavior. In the case of logic programming, behavior comprises 
computed answers semantics, termination, and side effects such as input/output. It should 
be observed that particular application domains might require extending the notion of 
behavior to include such concepts as efficiency or memory use. Moreover, in order for some 
refactorings to be applicable certain preconditions should hold, like absence of user-defined 
meta-predicates for dead-code elimination discussed in Section 4.1. Sometimes verification 
of the preconditions cannot be done automatically, but must be delegated to the user. 

Subsequently, the chosen transformation is applied. This step might also require user 
input. Consider for example a refactoring that renames a predicate: while automatic tools 
can hardly be expected to guess the new predicate name, they should be able to detect all 
program points affected by the change. This refactoring is further studied in Section 4.3. 

Finally, the consistency between the refactored program code and other related artifacts 
should be maintained. By artifacts we understand among others software documentation, 
specifications and test descriptions. The ability to perform this task automatically strongly 
depends on the formalisms used to express the corresponding artifacts. For instance, 
documentation generators such as lpdoc (Hermenegildo 2000) make it possible to keep the 
documentation consistent automatically, whereas ad hoc unstructured comments are much 
harder to update automatically. Ensuring consistency is considered as future work. 

3 Detailed Prolog Refactoring Example 

We illustrate some of the techniques proposed by a detailed refactoring example. 
Consider the following code fragment from O'Keefe's "The Craft of Prolog" (1994), p. 195. 
It describes three operations on a reader data structure used to sequentially read terms 
from a file. The three operations are make_reader/3, which initializes the data structure, 
reader_done/1, which checks whether no more terms can be read, and reader_next/3, 
which gets the next term and advances the reader. 

Listing 3.1 - O'Keefe's original version 

make_reader(File, Stream, State) :-
    open(File, read, Stream),
    read(Stream, Term),
    reader_code(Term, Stream, State).

reader_code(end_of_file, _, end_of_file) :- !.
reader_code(Term, Stream, read(Term, Stream, Position)) :-
    stream_position(Stream, Position).

reader_done(end_of_file).

reader_next(Term, read(Term, Stream, Pos), State) :-
    stream_position(Stream, _, Pos),
    read(Stream, Next),
    reader_code(Next, Stream, State).



We will now apply several refactorings to the above program in order to improve its 
readability. 

Firstly, we use if-then-else introduction (Section 4.4) to get rid of the red cut¹ in the 
reader_code/3 predicate (modified code is underlined): 

Listing 3.2 - Replace cut by if-then-else 

reader_code(Term, Stream, State) :-
    (   Term = end_of_file,
        State = end_of_file ->
        true
    ;
        State = read(Term, Stream, Position),
        stream_position(Stream, Position)
    ).



The result of this automatic transformation reveals two malpractices: the first is 
producing output before the commit, something O'Keefe himself disapproves of in (1994). This 
malpractice and the ways to resolve it are further investigated in Section 4.4. The problem 
is fixed as follows: 

Listing 3.3 - Output after commit 

reader_code(Term, Stream, State) :-
    (   Term = end_of_file ->
        State = end_of_file
    ;
        State = read(Term, Stream, Position),
        stream_position(Stream, Position)
    ).

¹ As defined in e.g. (O'Keefe 1994): a cut that alters the meaning. 



The second malpractice is a unification in the condition of the if-then-else where an 
equality test is meant. Consider the case that the Term argument is a variable. Then the 
binding of Term to the atom end_of_file is certainly unwanted behavior. The 
transformation in question is discussed in Section 4.4. The following code does not exhibit the 
problematic behavior: 

Listing 3.4 - Equality test 

reader_code(Term, Stream, State) :-
    (   Term == end_of_file ->
        State = end_of_file
    ;
        State = read(Term, Stream, Position),
        stream_position(Stream, Position)
    ).
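
The difference between the two conditions can be observed in isolation. The following 
sketch uses two hypothetical helper predicates, classify_unify/2 and classify_equal/2, 
that are not part of O'Keefe's code and only mimic the shape of the condition. 

```prolog
% Condition written with unification: an unbound Term is silently bound.
classify_unify(Term, Kind) :-
    (   Term = end_of_file
    ->  Kind = eof
    ;   Kind = term
    ).

% Condition written with an equality test: Term is left untouched.
classify_equal(Term, Kind) :-
    (   Term == end_of_file
    ->  Kind = eof
    ;   Kind = term
    ).
```

With an unbound first argument, ?- classify_unify(T, K). binds T to end_of_file and 
answers K = eof, whereas ?- classify_equal(T, K). leaves T unbound and answers K = term. 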



Next, we notice that the conjunction read/2, reader_code/3 occurs twice. By applying 
predicate extraction (Section 4.4) of this common sequence, we get: 

Listing 3.5 - Predicate extraction 

make_reader(File, Stream, State) :-
    open(File, read, Stream),
    read_next_state(Stream, State).

reader_next(Term, read(Term, Stream, Pos), State) :-
    stream_position(Stream, _, Pos),
    read_next_state(Stream, State).

read_next_state(Stream, State) :-
    read(Stream, Term),
    reader_code(Term, Stream, State).



Next we put the input argument first and the output arguments last (Section 4.3 below), 
a principle also advocated in (O'Keefe 1994): 

Listing 3.6 - Argument reordering 

reader_next(read(Term, Stream, Pos), Term, State) :-
    stream_position(Stream, _, Pos),
    read_next_state(Stream, State).



Finally, note that the naming of the two builtins stream_position/[2,3] may be 
confusing to the user. It is easier to distinguish between their functionality based on predicate 
name than based on arity. We introduce the less confusing names get_stream_position/2 
and set_stream_position/3 respectively. In addition, we provide a more consistent 
naming for make_reader, more in line with the other two predicates in the interface. The 
importance of consistent naming conventions is also stressed in (O'Keefe 1994). 

Note that direct renaming of built-ins such as stream_position is not possible, but a 
similar effect can be achieved by extracting the built-in into a new predicate with the 
desired name. Extracting a predicate and renaming predicates are considered in Sections 4.4 
and 4.3 respectively. 

In order to avoid confusion between a built-in predicate read and a functor read we 
rename the latter functor to reader. 

Listing 3.7 - Renaming 

reader_init(File, Stream, State) :-
    open(File, read, Stream),
    reader_next_state(Stream, State).

reader_next(reader(Term, Stream, Pos), Term, State) :-
    set_stream_position(Stream, Pos),
    reader_next_state(Stream, State).

reader_done(end_of_file).

reader_next_state(Stream, State) :-
    read(Stream, Term),
    build_reader_state(Term, Stream, State).

build_reader_state(Term, Stream, State) :-
    (   Term == end_of_file ->
        State = end_of_file
    ;
        State = reader(Term, Stream, Position),
        get_stream_position(Stream, Position)
    ).

set_stream_position(Stream, Position) :-
    stream_position(Stream, _, Position).

get_stream_position(Stream, Position) :-
    stream_position(Stream, Position).



This example demonstrates how the code readability can be ameliorated by performing 
a series of relatively simple transformation steps. We have seen that some of these steps 
required user's input. Clearly the changes can be performed manually. However, refac- 
toring browsers should be able to guarantee consistency, correctness and furthermore can 
automatically single out opportunities for refactoring. 

Techniques applied above are well-suited for local code improvement, i.e., the objects 
modified are predicates and clauses. In the next section we also consider techniques for 
global code restructuring such as duplicate predicates removal (Section 4.1). 



4 A Catalogue of Prolog refactorings 

In this section we present the refactorings that we have found to be useful for Prolog 
programs. The considered Prolog programs are not limited to pure logic programs, but 
may contain various built-ins such as those defined in the ISO standard (1995). The only 
exception are higher-order constructs that are not dealt with automatically, but manually. 
This is done due to the fact that higher order constructs such as call make it impossible 
to decide at the compile-time which predicate is going to be called at the corresponding 
program point during execution. Automating the detection and handling of higher-order 
predicates is an important part of future work. 

The refactorings in this catalogue are grouped by their scope. The scope expresses the 
user-selected target of a particular refactoring. Hence, refactoring starts by choosing an 
object in the specified scope. For instance, split module (Section 4.2) starts with selecting 
a module. Then the object is transformed. In this example, the module is split. 
Finally, the changes propagate to the affected code outside the selected scope. The latter 
might happen when there is a dependency outside the scope. In the example, this corresponds 
to updating import declarations in other modules of the system. 

For Prolog programs we distinguish the following four scopes, based on the code units of 
Prolog: system scope (Section 4.1), module scope (Section 4.2), predicate scope (Section 4.3) 
and clause scope (Section 4.4). 

As a starting point for this catalogue we used Fowler's catalogue (2003) for object-oriented 
languages. We selected those refactorings with clear Prolog counterparts, and extended the list 
with Prolog-specific transformations and some well-known program transformations, such as 
dead code elimination. 

In the current technical note we only include a short summary of the refactorings 
and refer to the companion technical report (Schrijvers et al. 2003). This report contains 
the full catalogue with detailed description of the refactorings, examples, preconditions 
and automatization techniques. 

4.1 System Scope Refactorings 

The system scope encompasses the entire code base. The user wants to consider the system 
as a whole. 

4.1.1 Eliminate explicit module qualification 

In many Prolog systems, such as Quintus (Intelligent Systems Laboratory 2003a), the 
module system is non-strict, i.e. the normal visibility rules can be overridden by a special 
construct, called explicit module qualification and written as m:q, where m is a module that 
contains the definition of the predicate q/0. The refactoring proposed adds import and export 
declarations to get rid of these special syntax constructions. By forcing the code to conform 
to a strict module system a number of quality characteristics are improved. First of all, a 
strict module system better expresses the idea of information hiding, which is important 
for software maintainability and readability (Parnas 1972). Moreover, since not all Prolog 
systems support the above construct, code portability is improved. 

4.1.2 Extract common code into predicates 

This refactoring looks for common functionality across the system and extracts it into 
new predicates. The common functionality consists of identical subsequences of goals that 
are called in different predicate bodies. The overall 
readability of the program improves as the affected predicate bodies get shorter, and the 
calls to the new predicates can be more meaningful than what they replace. Moreover, the 
increased sharing simplifies maintenance as now only one copy needs to be modified. 

The problem of identifying identical subsequences of goals is related to determining 
longest repeated subsequences (Crow and Smith 1992; Pitkow and Pirolli 1999). 
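
A naive sketch of the search for a shared run of goals in two clause bodies is given next. 
It is our own illustration under simplifying assumptions, not the algorithm used in ViPReSS: 
the predicate names body_goals/2, common_run/3 and sublist_of/2 are hypothetical, bodies are 
plain conjunctions, and goals are compared by unification, which is more liberal than the 
check for identity up to variable renaming that a real tool would need. 

```prolog
:- use_module(library(lists)).

% body_goals(+Body, -Goals): flatten a conjunction into a list of goals.
body_goals((A, B), Goals) :- !,
    body_goals(A, GoalsA),
    body_goals(B, GoalsB),
    append(GoalsA, GoalsB, Goals).
body_goals(Goal, [Goal]).

% sublist_of(?Sub, +List): Sub is a contiguous sublist of List.
sublist_of(Sub, List) :-
    append(_, Suffix, List),
    append(Sub, _, Suffix).

% common_run(+Body1, +Body2, -Run): Run is a contiguous sequence of at
% least two goals occurring in both bodies. The bodies are copied first,
% so the variables of the original clauses are never bound.
common_run(Body1, Body2, Run) :-
    copy_term(Body1, Copy1),
    copy_term(Body2, Copy2),
    body_goals(Copy1, Goals1),
    body_goals(Copy2, Goals2),
    Run = [_, _|_],
    sublist_of(Run, Goals1),
    sublist_of(Run, Goals2).
```

Applied to the bodies of make_reader/3 and reader_next/3 from Listing 3.1, the only run 
found is the conjunction of read/2 and reader_code/3 that is extracted in Section 3. 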



4.1.3 Hide predicates 

This refactoring removes export declarations for predicates that are not imported in any 
other module. It simplifies the program by reducing the number of entry points into mod- 
ules and hence the intermodule dependencies. 
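
The candidates for hiding can be computed as a simple set difference. The sketch below 
assumes that export and import declarations have already been collected as facts; the 
fact names exports/2 and imports/3 and the predicate hideable/2 are hypothetical. 

```prolog
% Hypothetical facts collected from the module declarations:
%   exports(Module, Name/Arity)
%   imports(ImportingModule, FromModule, Name/Arity)
exports(reader, reader_init/3).
exports(reader, build_reader_state/3).
imports(main, reader, reader_init/3).

% hideable(-Module, -Pred): Pred is exported by Module but imported nowhere.
hideable(Module, Pred) :-
    exports(Module, Pred),
    \+ imports(_, Module, Pred).
```

Here ?- hideable(M, P). answers M = reader, P = build_reader_state/3. The sketch ignores 
calls through explicit module qualification, so it presupposes the refactoring of 
Section 4.1.1. 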

4.1.4 Remove dead code 

Dead code is code that can never be executed and therefore can be safely eliminated 
without affecting correctness of the execution. Dead code elimination is sometimes per- 
formed in compilers for efficiency reasons, but it is also useful for developers: dead code 
clutters the program. We consider a predicate definition as the unit of dead code. 
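
With predicates as the unit, dead code detection amounts to a reachability computation on 
the call graph. The following sketch assumes that the call graph and the list of defined 
predicates are given as facts (calls/2 and defined/1, both hypothetical) and that the 
graph is complete, i.e., that no calls are hidden behind meta-predicates. 

```prolog
:- use_module(library(lists)).

% Hypothetical call graph: calls(Caller, Callee), both as Name/Arity.
calls(main/0, reader_init/3).
calls(reader_init/3, reader_next_state/2).
calls(old_report/1, format_line/2).

% Hypothetical list of all predicates defined in the system.
defined([main/0, reader_init/3, reader_next_state/2,
         old_report/1, format_line/2]).

% reachable(+Frontier, +Seen, -All): All extends Seen with every
% predicate reachable from Frontier. Seen predicates are skipped, so
% cycles in the call graph do not cause non-termination.
reachable([], Seen, Seen).
reachable([P|Ps], Seen, All) :-
    (   memberchk(P, Seen)
    ->  reachable(Ps, Seen, All)
    ;   findall(Q, calls(P, Q), Qs),
        append(Qs, Ps, Next),
        reachable(Next, [P|Seen], All)
    ).

% dead(+TopLevel, -Dead): defined predicates unreachable from TopLevel.
dead(TopLevel, Dead) :-
    defined(Defined),
    reachable(TopLevel, [], Live),
    findall(P, ( member(P, Defined), \+ memberchk(P, Live) ), Dead).
```

For the top-level predicate main/0 the query ?- dead([main/0], Dead). yields 
Dead = [old_report/1, format_line/2]. 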

4.1.5 Remove duplicate predicates 

Predicate duplication or cloning is a well-known problem, prominently caused by "copy 
& paste" and unawareness of available libraries and exported predicates in other modules. 
The main problem with duplication is its bad maintainability. It is up to the user to decide 
whether to throw away some of the duplicates and to use one of the remaining definitions 
instead or to replace all the duplicate predicates by a new version in a new module. 

4.1.6 Rename functor 

This refactoring renames a term functor across the system. If the functor has several 
different meanings and only one should be renamed, it is up to the user to identify what 
occurrence corresponds with what meaning. 



4.2 Module Scope Refactorings 

The module scope considers a particular module. Usually a module is implementing a 
well-defined functionality and is typically contained in one file. 

4.2.1 Merge modules 

Merging several modules into one can be advantageous in case of strong interdepen- 
dency of the modules involved. Moreover, merging existing modules and splitting the re- 
sulting module can lead to an improved module structure. 

4.2.2 Remove dead intra-module code 

Similar to dead code removal for an entire system (see Section 4.1), this refactoring 
works at the level of a single module. It is useful for incomplete systems or library modules 
with an unknown number of uses. Recall that determining the liveness of the code requires 
knowledge of top-level predicates. In the case of intra-module dead code elimination, the 
set of top level predicates is extended with, or replaced by, the exported predicates of the 
module. 

4.2.3 Rename module 

This refactoring applies when the name of the module no longer corresponds to the 
functionality it implements e.g. due to other refactorings. 



4.2.4 Split module 

The refactoring is useful to split unrelated parts of a module or make a large module 
more manageable. 

Moores (1998) has shown that the number of user-defined predicates correlates 
with the number of errors detected. Based on an empirical study he suggested a threshold 
of around 35 ± 5 predicates per program. While this is hardly reasonable as a requirement 
for an entire Prolog system, exceeding the threshold should be used as a guideline for when 
the Split Module refactoring can be applied. 



4.3 Predicate Scope Refactorings 

The predicate scope targets a single predicate. The code that depends on the predicate may 
need updating as well. But this is considered an implication of the refactoring of which 
either the user is alerted or the necessary transformations are performed automatically. 

4.3.1 Add argument 

This refactoring should be applied when a callee needs more information from its (direct 
or indirect) caller, which is very common in Prolog program development. Given a variable 
in the body of the caller and the name of the callee, the refactoring browser should prop- 
agate this variable along all possible computation paths from the caller to the callee. This 
refactoring is an important preliminary step preceding additional functionality integration 
or efficiency improvement. 
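
A small before/after illustration, with hypothetical predicate names throughout: the 
variable Rate is known in the caller order_prices/3 and has to reach the callee price/2, 
two calls further down. The refactored version is given the name order_prices_rate/3 only 
so that both versions can be loaded side by side. 

```prolog
% Hypothetical discount table.
discount(student, 0.2).
discount(regular, 0.0).

% Before: Rate is known in order_prices/3 but cannot reach price/2.
order_prices(Customer, Items, Prices) :-
    discount(Customer, Rate),
    Rate >= 0,
    prices(Items, Prices).

prices([], []).
prices([I|Is], [P|Ps]) :- price(I, P), prices(Is, Ps).

price(item(_Name, Net), Net).

% After "add argument": Rate is threaded through prices/3 down to price/3.
order_prices_rate(Customer, Items, Prices) :-
    discount(Customer, Rate),
    Rate >= 0,
    prices(Rate, Items, Prices).

prices(_Rate, [], []).
prices(Rate, [I|Is], [P|Ps]) :- price(Rate, I, P), prices(Rate, Is, Ps).

% Behaviour is unchanged; Rate is merely available for a later enhancement.
price(_Rate, item(_Name, Net), Net).
```

Both versions compute the same answers. The new argument only becomes meaningful once 
price/3 is changed to apply the discount, which is a functional change and no longer a 
refactoring. 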

4.3.2 Move predicate 

This refactoring moves a predicate definition from one module to another. It can improve 
the overall structure of the program by bringing together interdependent or related 
predicates, hence improving both cohesion of each one of the modules involved, and coupling 
of the pair. Move predicate is often applied after predicate extraction, i.e., extract common 
code or extract predicate locally, discussed in Sections 4.1 and 4.4 respectively. 

4.3.3 Rename predicate 

This refactoring can improve readability and should be applied when the name of a 
predicate does not reveal its purpose. 

4.3.4 Reorder arguments 

Our experience suggests that while writing predicate definitions Prolog programmers 
tend to begin with the input arguments and to end with the output arguments. This habit 
has been identified as a good practice and even further refined by O'Keefe (1994) to more 
elaborate rules. Unfortunately, this practice is difficult to maintain when additional 
arguments are added later. We observed that failure to conform to this "input first output last" 
expectation pattern is experienced as very confusing. 

4.3.5 Specialize predicate 

By specializing a predicate we mean producing a (number of) more specific version(s) 
of a given predicate provided some knowledge on the intended uses of the predicate. 
Specialisation can simplify code as well as make a meaningful distinction between different 
uses of a predicate. 

4.3.6 Remove redundant arguments 

The basic intuition here is that parameters that are no longer used by a predicate should 
be dropped. It improves readability. 

Leuschel and Sørensen (1996) established that the redundancy property is undecidable 
and suggested two techniques to find safe and effective approximations: top-down 
goal-oriented RAF (Redundant Argument Filtering) and bottom-up goal-independent FAR 
(RAF "upside-down"). In the context of refactoring FAR is the more useful technique, 
since only FAR deals correctly with exported predicates used in unknown goals. 
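
The effect of the refactoring can be illustrated by a hand-written before/after pair. The 
predicates len_acc/4 and len/3 are hypothetical, and the example is not the output of RAF 
or FAR; it only shows the kind of argument these analyses are meant to detect. 

```prolog
% Before: the second argument is passed along but never inspected.
len_acc([], _Unused, N, N).
len_acc([_|T], Unused, N0, N) :-
    N1 is N0 + 1,
    len_acc(T, Unused, N1, N).

% After: the redundant argument has been dropped from every clause head
% and from the recursive call.
len([], N, N).
len([_|T], N0, N) :-
    N1 is N0 + 1,
    len(T, N1, N).
```

Every call site has to be updated accordingly, e.g. len_acc(L, foo, 0, N) becomes 
len(L, 0, N). 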

4.4 Clause Scope Refactorings 

The clause scope affects a single clause in a predicate. Usually, this does not affect any 
code outside the clause directly. 

4.4.1 Extract predicate locally 

This refactoring is similar to the system-scope refactoring with the same name. However, 
it does not aim to automatically discover useful candidates for replacement. The user is 
responsible for selecting the subgoal that should be extracted, in order to improve the 
readability. 

4.4.2 Invert if-then-else 

The order of "then" and "else" branches can be important for code readability. To en- 
hance readability it might be worthwhile putting the shorter branch as "then" and the longer 
one as "else". Alternatively, the negation of the condition may be more readable because, 
for example, a double negation can be eliminated. 
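
A small sketch of the inversion, using the hypothetical predicates describe/2 and 
describe_inverted/2. The condition is written with == so that it binds no variables. With 
a condition that may bind variables, a version negated by \+ and its inversion are not 
equivalent in general, since \+ never binds. 

```prolog
% Before: negated condition and the longer branch first.
describe(List, Text) :-
    (   \+ List == []
    ->  length(List, N),
        Text = items(N)
    ;   Text = empty
    ).

% After inversion: the negation disappears and the short branch comes first.
describe_inverted(List, Text) :-
    (   List == []
    ->  Text = empty
    ;   length(List, N),
        Text = items(N)
    ).
```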

4.4.3 Replace cut by if-then-else 

This technique aims at improving program readability by replacing cuts (!) by the more 
declarative if-then-else (-> ; ). More detailed discussion on replacing cut by if-then-else 
is deferred to Related work and extensions. 

4.4.4 Replace unification by (in)equality test 

Often full unifications are used instead of equality or other tests. O'Keefe (1994) 
advocates the importance of steadfast code. Recall that steadfast code produces the right 
answers for all possible modes and inputs. A more moderate approach is to write code that 
works for the intended mode only. Unification succeeds in several modes and so does not 
convey a particular intended mode. Equality (==, =:=) and inequality (\==, =\=) checks 
usually only succeed for one particular mode and fail or raise an error for other modes. 
Hence their presence makes the intended mode easier to see, both in the code and at runtime. 
Moreover, if only a comparison was intended, then full unification may lead to unwanted 
behaviour in unforeseen cases. 

4.4.5 Produce output after commit 

This refactoring addresses a similar issue as the previous one. Producing output before 
the commit (cut) does not properly convey the intended mode of a predicate. Moreover it 
may lead to unexpected results when used in the wrong mode. 
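
The well-known maximum predicate illustrates the problem. The names max_early/3 and 
max_late/3 are chosen here only to keep both versions apart. 

```prolog
% Output produced before the commit: the third argument is unified in
% the clause head, i.e. before the test and the cut are reached.
max_early(X, Y, X) :- X >= Y, !.
max_early(_, Y, Y).

% Output produced after the commit: the result is only unified once
% the clause has been selected.
max_late(X, Y, Z) :- X >= Y, !, Z = X.
max_late(_, Y, Y).
```

In the intended mode both versions agree. If the third argument is already bound, however, 
the query ?- max_early(3, 1, 1). wrongly succeeds: the head of the first clause does not 
unify, so the second clause is tried. The query ?- max_late(3, 1, 1). fails as expected. 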



5 The ViPReSS refactoring browser 

The refactoring techniques presented in Section 4 have been implemented in the prototype 
refactoring browser ViPReSS². It has been implemented on the basis of VIM, a popular 
clone of the well-known VI editor. The text editing facilities of VIM make it easy to 
implement techniques like move predicate (Section 4.3). 

Most of the refactoring tasks have been implemented as SICStus Prolog 
(Intelligent Systems Laboratory 2003b) programs inspecting source files and/or call graphs. 
Updates to files have been implemented either directly in the scripting language of VIM or, 
when many files need updating at once, through ed scripts. VIM functions were written to 
initiate the refactorings and to get user input. 

ViPReSS has been successfully applied to a large (more than 53 KLOC) legacy system 
used at the Computer Science department of the Katholieke Universiteit Leuven to manage 
the educational activities. The system, called BTW, has been developed and extended since 
the early eighties by more than ten programmers, many of whom are no longer employed 
by the department. The implementation has been done in MasterProLog (IT Masters 2000), 
which is no longer supported. Therefore, preparing the code for migration to a more 
modern Prolog dialect and general structure improvement were essential for further evolution 
of the system. 

By using the refactoring techniques we succeeded in obtaining a better understanding of 
this real-world system, in improving its structure and maintainability, and in preparing it 
for intended changes: porting it to a state-of-the-art Prolog system and adapting it to new 
educational tasks the department is facing as a part of the unified Bachelor-Master system 
in Europe. 

A preliminary study revealed that many modules were unused. We brought in an expert 
to help us identify the bulk of these unused modules, including out-of-fashion user inter- 
faces and outdated versions of program files. This reduced the system size to a mere 20,000 
lines. 

Next, the actual refactoring process was started. As the first phase we applied 
system-scope refactorings. ViPReSS was used to clean up after the bulk dead code removal: 299 
predicates in the remaining modules were identified as dead. This reduced the size by 
another 1,500 lines. Moreover ViPReSS discovered 79 pairwise identical predicates. In 
most of the cases, identical predicates were moved to new modules used by the original 
ones. The previous steps allowed us to improve the overall structure of the program by 
reducing the number of files from 294 to 116 with a total of 18,000 lines. Very little time 
was spent to bring the system into this state. The experts were sufficiently familiar with the 
system to identify obsolete parts. The system-scope refactorings took only a few minutes 
each. During this phase most of the work has been done by ViPReSS, while the user's 
involvement was limited to choosing a way to deal with duplicate predicates. 

² Vi(m) P(rolog) Re(factoring) (by) S(chrijvers) (and) S(erebrenik) 

The second phase of refactoring consisted of a thorough code inspection aimed at local 
improvement. Many malpractices were identified: excessive use of cut (Section 4.4) 
combined with output construction before commit (Section 4.4) being the most notable one. 
Additional "bad smells" discovered include bad predicate names such as q, unused 
arguments and unifications instead of identity checks or numerical equalities (Sections 4.3 and 
4.4 respectively). Some of these were located by ViPReSS, others were recognised by the 
users, while ViPReSS performed the corresponding transformations. This step is more 
demanding of the user. She has to consider all potential candidates for refactoring separately 
and decide on what transformations apply. Hence, the lion's share of the refactoring time 
is spent on these local changes. 

In summary, from the case study we learned that automatic support for refactoring techniques 
is essential and that ViPReSS is well-suited for this task. As a result of applying 
refactoring to BTW we obtained better-structured, lumber-free code. It is now not only more 
readable and understandable, but also makes the intended changes simpler to implement. 
From our experience with refactoring this large legacy system and the relative time 
investments of the global and the local refactorings, we recommend starting out with the 
global ones and then selectively applying local refactorings as the need arises. 

The current version of ViPReSS can be downloaded from 
http://www.cs.kuleuven.ac.be/~toms/vipress. 



6 Conclusions 

In this paper we have studied refactoring techniques for Prolog. Firstly, we have shown 
that refactoring is a viable technique for Prolog and that many of the existing techniques 
developed for refactoring in general are applicable. Our refactoring catalogue contains 
many such refactorings. 

Secondly, Prolog-specific refactorings are possible and the application of some general 
techniques may be highly specialized towards Prolog. In this context, the companion technical 
report (Schrijvers et al. 2003) shows how refactoring fits in with existing work on 
program analysis and transformation in the context of Prolog and how many of these existing 
techniques may be adapted for the purpose of partially automating the refactoring 
process. Also, ViPReSS, our refactoring browser, integrates several automatable parts of 
the presented refactorings in the VIM editor. 

Finally, it should be clear that refactoring Prolog programs is not just viable but very 
useful for the maintenance of Prolog programs. Refactoring helps bridge the gap between 
prototypes and real-world applications. Indeed, extending a prototype to provide additional 
functionality often leads to cumbersome code. Refactoring allows software developers both 
to clean up code after changes and to prepare code for future changes. These are important 
benefits that also apply to logic programming. 

As completeness of the catalogue is clearly not possible, we aimed to show a wide range 
of possibilities for future work on combining the formal techniques of program analysis 
and transformation with software engineering. Throughout the catalogue many specific 
issues for future work have been mentioned. Below we list related work and more general 
challenges for the future. 

6.1 Related and Future Work 

Logic programming has often been used to implement refactorings for other languages, 
e.g. a meta-logic very similar to Prolog is used to detect, for instance, obsolete parameters 
in (Tourwé and Mens 2003). 

Seipel et al. (2003) include refactoring among the analysis and visualization techniques 
that can be easily implemented by means of FnQuery, a Prolog-inspired query language 
for XML. However, the discussion stays at the level of an example. The M.Sc. thesis of 
Steinke (2003) was dedicated to refactoring of logic programs. A catalogue of refactorings 
has been composed and a prototype system has been implemented. However, only 
predicate-scope refactorings have been considered and only the transformation step has 
been implemented. 

In the logic programming community questions related to refactoring have been inten- 
sively studied in the context of program transformation and specialisation. There are two 
important differences with this line of work. Firstly, refactoring improves readability, 
maintainability and extensibility rather than performance. Secondly, for refactoring, user 
input is essential, while in the mentioned literature strictly automatic approaches were considered. 
However, some of the transformations developed for program optimization, e.g. dead code 
elimination, can be considered as refactorings and have an important function in refactor- 
ing browsers. 

To further increase the level of automation of particular refactorings, additional 
information such as types and modes can be used. 

Future refactoring tools can also benefit from integration with Prolog development en- 
vironments. Modern Prolog systems are often equipped with features extending the ISO 
Standard such as constraint solving over different domains and Constraint Handling Rules, 
coroutining, interfaces to foreign languages, GUI-development systems and databases. In 
most of the cases, the refactoring techniques described above can still be applied to im- 
prove the code. Certain refactorings may be specially designed for particular extensions. 
For instance, our experience suggests that simplifying primitive constraints may be useful 
in the case of CLP. 